feat: add igdb and openlibrary imports #55
📝 Walkthrough

Adds IGDB and OpenLibrary service clients, Twitch OAuth environment variables, a multi-source import UI (TMDB/IGDB/OpenLibrary), HTMX search endpoints, a unified import/cover-download flow, and renames TMDB poster fields to cover equivalents.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Browser
    participant ImportView as Media Import/Edit View
    participant TMDB
    participant IGDB
    participant OpenLibrary
    User->>Browser: Select source tab & submit query
    Browser->>ImportView: HTMX search request (tmdb/igdb/openlibrary)
    ImportView->>TMDB: search_multi(query) / TMDB API
    ImportView->>IGDB: search_games(query) / IGDB API (via Twitch OAuth)
    ImportView->>OpenLibrary: search_books(query) / OpenLibrary API
    TMDB-->>ImportView: results (title, year, cover_url, contributors)
    IGDB-->>ImportView: results (title, year, cover_url, contributors)
    OpenLibrary-->>ImportView: results (title, year, cover_url, contributors)
    ImportView-->>Browser: Render suggestions partial into #import-results
    User->>Browser: Click result to import (tmdb_id/igdb_id/openlibrary_key)
    Browser->>ImportView: GET media_edit with external id
    ImportView->>TMDB: fetch details / cover URL (or)
    ImportView->>IGDB: fetch details / cover URL (or)
    ImportView->>OpenLibrary: fetch details / cover URL
    TMDB-->>ImportView: detailed metadata
    IGDB-->>ImportView: detailed metadata
    OpenLibrary-->>ImportView: detailed metadata
    ImportView->>ImportView: _download_cover(cover_url) calls the TMDB/IGDB/OpenLibrary clients' download_cover
    ImportView->>Browser: Render pre-filled edit form with import_data, import_contributors, import_tags
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 inconclusive)
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@src/core/services/igdb.py`:
- Around lines 145-151: The Apicalypse query embeds raw user input directly into the `body` string, which can break the query. Add an escaping helper (e.g., `escape_apicalypse_string`) and call it before interpolating `query` into `body`, so that backslashes and double quotes are escaped (`\` becomes `\\`, `"` becomes `\"`) and newlines and other control characters are stripped or escaped. Build `body` from the escaped value instead of the raw query.
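The escaping described above can be sketched as a small helper. The names `escape_apicalypse_string` and `build_search_body`, and the exact field list, are illustrative; the real query in `igdb.py` may request different fields:

```python
def escape_apicalypse_string(value: str) -> str:
    """Escape user input before embedding it in an Apicalypse query body.

    Backslashes are doubled first, then double quotes are escaped, and
    control characters (including newlines) are stripped so they cannot
    terminate or restructure the query.
    """
    value = value.replace("\\", "\\\\").replace('"', '\\"')
    return "".join(ch for ch in value if ch >= " ")


def build_search_body(query: str) -> str:
    # Hypothetical body builder mirroring the shape of an IGDB search query.
    safe = escape_apicalypse_string(query)
    return f'search "{safe}"; fields name, first_release_date, cover.url; limit 10;'
```

Doubling backslashes before escaping quotes matters: reversing the order would re-escape the backslashes just inserted in front of quotes.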
In `@src/core/services/tmdb.py`:
- Around lines 198-215: The domain check in `download_cover` is weak (substring containment on `cover_url`). Validate the URL with a proper `startswith` check against the TMDB image base (e.g., `"https://image.tmdb.org"` or a `TMDB_IMAGE_BASE_URL` constant) before calling `requests.get`. Keep the early-return/`logger.warning` behavior on invalid URLs, and retain the existing `logger.exception` and `requests.RequestException` handling so error semantics are unchanged.
In `@src/core/views.py`:
- Around lines 133-151: `_download_cover` currently routes by substring. Instead, parse the URL with `urllib.parse.urlparse`, validate that the scheme is `https` (or another allowed scheme), and require the parsed hostname to exactly equal `image.tmdb.org`, `images.igdb.com`, or `covers.openlibrary.org` before calling the respective client; return `None` on any mismatch. Apply the same fix to the TMDB client method in `src/core/services/tmdb.py` that currently uses a substring check: parse the URL and verify the scheme and `parsed.hostname == "image.tmdb.org"` (plus any other allowed TMDB hostnames, if applicable) before proceeding. Both places must reject hostnames like `image.tmdb.org.attacker.com` and accept exact hostnames only.
In `@src/templates/partials/igdb/igdb_suggestions.html`:
- Around lines 36-40: The `{% elif query %}` condition in `igdb_suggestions.html` never fires because the view doesn't include the query in the context. Update the `igdb_search_htmx` view (and likewise `tmdb_search_htmx`) so every return path (short-query early return, API error, and successful search) builds `base_context = {"results": [], "media_id": media_id, "query": query}` (with the results list populated on success) before rendering the template. Verify that `igdb_suggestions.html` and `tmdb_suggestions.html` now receive and use the `query` variable.
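One way to guarantee every return path carries the query is to build the context up front and only mutate it afterwards. This sketch injects the search client as a parameter and uses a bare `Exception` as a stand-in for `IGDBError`; the assumed `MIN_QUERY_LENGTH` threshold and context keys mirror, but may not match, the real view:

```python
MIN_QUERY_LENGTH = 2  # assumed short-query threshold


def igdb_search_context(query: str, media_id: int, search_games) -> dict:
    """Build the template context for an HTMX suggestions render.

    All three return paths (short query, API error, success) share one
    base_context, so the template's {% elif query %} branch always works.
    """
    base_context = {"results": [], "media_id": media_id, "query": query}
    if len(query.strip()) < MIN_QUERY_LENGTH:
        return base_context  # short-query early return still carries query
    try:
        base_context["results"] = search_games(query)
    except Exception:  # stands in for IGDBError in the real view
        base_context["error"] = True
    return base_context
```

The same pattern applies verbatim to the TMDB and OpenLibrary views, since all three render suggestion partials with the same context shape.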
In `@src/templates/partials/openlibrary/openlibrary_suggestions.html`:
- Around lines 43-47: The `{% elif query %}` block never runs because the `openlibrary_search_htmx` view does not include a `query` variable in the context. Pass the original search string (e.g., a variable named `query` or `search_query`) into the template context alongside the search results and error flags so the condition can evaluate; make sure the context key matches the template's name `query`, or change the template to use the chosen key consistently.
🧹 Nitpick comments (1)
src/templates/base/media_import.html (1)
Lines 16-36: Optional: de-duplicate tab-switch logic. Lines 23/29/35 repeat the same DOM toggling and results clearing; a small helper would reduce copy-paste risk when adding new sources.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/core/services/openlibrary.py`:
- Around lines 223-230: The loop that extracts author keys from `work_data` (iterating over `author_ref` from `work_data.get("authors", [])`) assumes `author_ref["author"]` is a dict and calls `.get("key")` on it, which raises `AttributeError` for string or `None` payloads. Make the access type-safe: if `author_ref` is a dict, check whether `author_ref.get("author")` is a dict before calling `.get("key")`; otherwise fall back to `author_ref.get("key")` or `None`. That way `author_key` is only assigned from a safe dict access, and the import will not abort on non-dict author payloads.
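The type-safe extraction might look like the helper below; the name `extract_author_keys` is hypothetical, since the real code does this inline while building contributor data:

```python
def extract_author_keys(work_data: dict) -> list[str]:
    """Collect author keys from OpenLibrary work data.

    Tolerates malformed payloads: entries that are strings or None are
    skipped, and entries whose "author" field is not a dict fall back to
    a top-level "key" lookup instead of raising AttributeError.
    """
    keys: list[str] = []
    for author_ref in work_data.get("authors", []):
        author_key = None
        if isinstance(author_ref, dict):
            author = author_ref.get("author")
            if isinstance(author, dict):
                author_key = author.get("key")
            else:
                author_key = author_ref.get("key")
        if author_key:
            keys.append(author_key)
    return keys
```

Entries that yield no key are simply dropped, so one odd author record no longer aborts the whole import.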
- Around lines 32-79: `_is_valid_genre` uses `any(pattern in subject_lower for pattern in EXCLUDED_SUBJECT_PATTERNS)`, which incorrectly filters out valid multi-word genres like "science fiction". Change the exclusion logic so single-word patterns (e.g., "fiction", "juvenile", "children") are checked with equality, while substring checks are preserved for multi-word patterns (e.g., "in fiction", "nyt:", "spanish language"): normalize `subject_lower`, then return `False` if `subject_lower in EXCLUDED_EXACT_SET`, and return `False` if `any(pattern in subject_lower for pattern in EXCLUDED_CONTAINS_LIST)` (or implement equivalent branching over the existing `EXCLUDED_SUBJECT_PATTERNS` by special-casing "fiction"). Keep the references to `EXCLUDED_SUBJECT_PATTERNS` and `MAX_GENRE_LENGTH`, and leave the rest of `_is_valid_genre` (comma/person-name handling) unchanged.
♻️ Duplicate comments (1)
src/core/views.py (1)
Lines 133-149: Harden cover URL routing against host spoofing. Substring checks allow bypasses like `https://image.tmdb.org.attacker.com/...`. Parse the URL and compare the exact hostname and scheme before delegating to clients.
🧹 Nitpick comments (2)
src/core/services/igdb.py (2)
Lines 29-30: Module-level token cache may cause issues in concurrent scenarios. The global `_token_cache` dict works for single-instance deployments but could lead to race conditions if multiple threads attempt to refresh the token simultaneously. Consider using thread-safe primitives if this becomes an issue.

Potential thread-safe approach:

```diff
+import threading
+
 # Cache for access token (simple in-memory cache)
 _token_cache: dict = {"access_token": None, "expires_at": 0}
+_token_lock = threading.Lock()
```

Then wrap the token refresh logic with `with _token_lock:` in `_get_access_token`.
Lines 104-109: Consider preserving exception context. Using `raise IGDBError(msg) from None` suppresses the original exception chain, which can make debugging harder. Consider using `raise IGDBError(msg) from e` to preserve the original cause.

Proposed fix:

```diff
-        except requests.RequestException:
+        except requests.RequestException as e:
             logger.exception("Failed to get Twitch access token")
             msg = "Failed to authenticate with Twitch"
-            raise IGDBError(msg) from None
+            raise IGDBError(msg) from e
```
No description provided.